    A Survey on Differential Privacy with Machine Learning and Future Outlook

    Machine learning models and applications have become increasingly pervasive. With this rapid growth in their development and deployment, concerns about privacy have risen, and there is a legitimate need to protect data from leakage and attacks. One of the strongest and most prevalent privacy models for protecting machine learning models against attacks and vulnerabilities is differential privacy (DP). DP is a strict, rigorous definition of privacy: it guarantees that an adversary cannot reliably predict whether a specific participant is included in the dataset. It works by injecting noise into the inputs, the outputs, the ground-truth labels, the objective function, or the gradients to mitigate privacy risks and protect the data. To this end, this survey presents differentially private machine learning algorithms grouped into two main categories (traditional machine learning models vs. deep learning models). Moreover, future research directions for differential privacy with machine learning algorithms are outlined. Comment: 12 pages, 3 figures
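
As a rough illustration of the "noise on the outputs" case mentioned in the abstract, the sketch below releases a counting query with the Laplace mechanism. It is not taken from the survey: the function names, the example data, and the choice of a counting query (sensitivity 1) are assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of seven participants.
rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 61, 38]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of participants aged 40+: {noisy:.2f}")
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy. The other injection points listed in the abstract (inputs, labels, objective, gradients) follow the same trade-off but need their own sensitivity analysis.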

    Text to Image Synthesis via Mask Anchor Points and Aesthetic Assessment

    Text-to-image synthesis is the process of generating an image from input text. It has a variety of applications in art generation, computer-aided design, and photo editing. In this thesis, we propose a new framework that leverages mask anchor points and divides image synthesis into two major steps. In the first step, a mask image is generated from the input text and the mask dataset. In the second step, the mask image is fed into a state-of-the-art mask-to-image generator. The mask image captures the semantic information and the spatial relationships via the anchor points. We develop a user-friendly interface that helps parse the input text into meaningful semantic objects. However, to synthesize an appealing image from text, image aesthetics criteria should also be considered. Therefore, we further improve the proposed framework by incorporating aesthetic assessment based on photography composition rules. To this end, we randomly generate a set of mask maps from the input text via the anchor point-based mask map generator, and then compute and rank the aesthetics score of every generated mask map under two composition rules, namely the rule of thirds and the rule of formal balance. In the next stage, we feed a subset of the mask maps, those with the highest, lowest, and average aesthetic scores, into the state-of-the-art mask-to-image generator. The resulting photorealistic images are re-ranked to obtain the synthesized image with the highest aesthetic score. The framework is designed to overcome the problems of images generated by state-of-the-art methods, such as unnaturalness, ambiguity, and distortion. It maintains the clarity of each entity's shape, the details of the entity edges, and a proper layout regardless of how complex the input text is and how many entities and spatial relations it contains.
    Our contribution is converting the input text into an appropriately constructed mask map, or a set of mask maps, via the Mask Map Generator (MG). The aesthetic assessment performed by the Aesthetic Ranking (AR) component is a further contribution of this study. Experiments on the challenging COCO-Stuff dataset illustrate the superiority of the proposed approach over the previous state of the art.
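
The ranking step can be pictured with a small sketch. The abstract does not give the thesis's scoring formula, so the function below is an assumed stand-in: it scores a layout by how close an object's centroid lies to the nearest rule-of-thirds intersection, and the candidate layouts are hypothetical.

```python
import math


def rule_of_thirds_score(cx, cy, width, height):
    """Score in (0, 1]; 1.0 means the centroid sits on a thirds intersection.

    Assumed formula for illustration only: one minus the distance to the
    nearest of the four intersections, normalised by the image diagonal.
    """
    points = [(width * i / 3, height * j / 3) for i in (1, 2) for j in (1, 2)]
    nearest = min(math.hypot(cx - px, cy - py) for px, py in points)
    return 1.0 - nearest / math.hypot(width, height)


# Hypothetical candidate layouts: name -> centroid of the main entity
# in a 300 x 300 mask map.
candidates = {
    "on_power_point": (100, 100),
    "centered": (150, 150),
    "corner": (10, 10),
}

# Rank the candidates from most to least aesthetic under this one rule.
ranked = sorted(
    candidates,
    key=lambda name: rule_of_thirds_score(*candidates[name], 300, 300),
    reverse=True,
)
print(ranked)
```

A fuller version would combine this with a formal-balance term and aggregate over all entities in the mask map; how the thesis weights the two rules is not stated in the abstract.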

    Olympic Games Event Recognition via Transfer Learning with Photobombing Guided Data Augmentation

    Automatic event recognition in sports photos is an interesting and valuable research topic in computer vision and deep learning. With the rapid growth and spread of continuously captured data, providing fast and precise access to the right information has become a challenging task of considerable importance for many practical applications, e.g., sports image and video search, sports data analysis, healthcare monitoring, monitoring and surveillance systems for indoor and outdoor activities, and video captioning. In this paper, we evaluate different deep learning models for recognizing and interpreting sports events in the Olympic Games. To this end, we collect a dataset, dubbed the Olympic Games Event Image Dataset (OGED), covering 10 different sports events scheduled for the Olympic Games Tokyo 2020. Transfer learning is then applied to three popular deep convolutional neural network architectures, namely AlexNet, VGG-16, and ResNet-50, along with various data augmentation methods. Extensive experiments show that ResNet-50 with the proposed photobombing-guided data augmentation achieves 90% accuracy.

    Real Estate Pricing Prediction via Textual and Visual Features

    The real estate industry relies heavily on accurately predicting the price of a house based on numerous factors such as size, location, amenities, and season. In this study, we explore the use of machine learning techniques for predicting house prices by considering both visual cues and estate attributes. We collected a dataset (REPD-3000) of 3000 houses across 74 cities in the USA and annotated each house with 14 estate attributes and five images: the exterior, living room, kitchen, bedroom, and bathroom. We extracted features from the input images using a convolutional neural network (CNN) and fed them, along with the estate attributes, into a multi-kernel deep learning regression model to predict the house price. Our model outperformed baseline models in extensive experiments, achieving its best result with a mean absolute error (MAE) of 16.60. We compared our model with multi-kernel support vector regression and analyzed the impact of incorporating individual feature sets. In the future, we plan to address class imbalance by having the same number of houses in each class and to explore feature engineering for improving the model's performance.
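
For readers unfamiliar with the metric, MAE is the average of the absolute differences between predicted and actual values. The sketch below computes it on made-up prices; the numbers and their units are assumptions, since the abstract does not state the unit behind the reported 16.60.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical actual and predicted prices (arbitrary units).
actual = [250.0, 310.0, 180.0, 420.0]
predicted = [240.0, 330.0, 185.0, 400.0]

mae = mean_absolute_error(actual, predicted)
print(mae)  # (10 + 20 + 5 + 20) / 4 = 13.75
```

Because MAE is expressed in the same unit as the target, a value is only comparable across models evaluated on the same price scale.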